We describe a hybrid hardware emulation environment: the Flexible Architecture for
Simulation and Testing (FAST). FAST integrates field-programmable gate arrays (FPGAs), 
microprocessors, and memory to enable rapid prototyping of chip multiprocessors, 
multithreaded architectures, or other novel computer architectures and chip-level memory 
systems. FAST combines configurable and fixed-function hardware and software to 
facilitate rapid prototyping by utilizing components optimized for their particu ...



2 Mapping Multi-Million Gate SoCs on FPGAs: Industrial Methodology and Experience
H. Krupnova 

February 2004 Proceedings of the conference on Design, automation and test in 
Europe - Volume 2 DATE '04 

Publisher: IEEE Computer Society 

Full text available: pdf(159.14 KB) Additional Information: full citation, abstract, index terms

Today, having a fast hardware platform for SoC software development prior to silicon is 
an important challenge to gain the time-to-market. The FPGAs offer an excellent 
prototyping basis for building hardware platforms since more than ten years ([1]). 
However, as the circuit complexity increases and project time-frames shrink, building a 
multi-FPGA prototype represents a real challenge from the complexity viewpoint. The 
paper describes the state-of-the-art mapping methodology, prototyping tools a ... 

Exploiting FPGA-features during the emulation of a fast reactive embedded system
Karlheinz Weiß, Thorsten Steckstor, Gernot Koch, Wolfgang Rosenstiel

February 1999 Proceedings of the 1999 ACM/SIGDA seventh international symposium 
on Field programmable gate arrays FPGA '99 

Publisher: ACM Press

4 Balancing performance and flexibility with hardware support for network architectures
Ilija Hadzic, Jonathan M. Smith

November 2003 ACM Transactions on Computer Systems (TOCS), volume 21 issue 4

Publisher: ACM Press 

Full text available: pdf(719.03 KB) Additional Information: full citation, abstract, references, index terms

The goals of performance and flexibility are often at odds in the design of network
systems. The tension is common enough to justify an architectural solution, rather than a 
set of context-specific solutions. The Programmable Protocol Processing Pipeline (P4) 
design uses programmable hardware to selectively accelerate protocol processing 
functions. A set of field-programmable gate arrays (FPGAs) and an associated library of 
network processing modules implemented in hardware are augmented with so ... 

Keywords: FPGA, P4, computer networking, flexibility, hardware, performance, 
programmable logic devices, programmable networks, protocol processing 



5 Power estimation approach for SRAM-based FPGAs

on Field programmable gate arrays FPGA '00
Publisher: ACM Press

This paper presents the power consumption estimation for the novel Virtex architecture.
Due to the fact that the XC4000 and the Virtex core architecture are very similar, we used 
the basic approaches for the XC4000-FPGAs power consumption estimation and extended 
that method for the new Virtex family. We determined an appropriate technology- 
dependent power factor Kp to calculate the power consumption on Virtex-chips, and 
developed a special benchmark test design to condu ... 

A transaction-based unified simulation/emulation architecture for functional

A transaction-based layered architecture providing for 100% portability of a C-based
testbench between simulation and emulation is proposed. Transaction-based 
communication results in performance which is commensurate with emulation without a 
hardware target. Testbench portability eliminates duplicated effort when combining 
system level simulation and emulation. An implementation based on the IKOS VStation
emulator validates these architectural claims on real designs.

Philip Koopman, Howie Choset, Rajeev Gandhi, Bruce Krogh, Diana Marculescu, Priya
Narasimhan, Joann M. Paul, Ragunathan Rajkumar, Daniel Siewiorek, Asim Smailagic, Peter
Steenkiste, Donald E. Thomas, Chenxi Wang

August 2005 ACM Transactions on Embedded Computing Systems (TECS), volume 4 issue 
3 

Publisher: ACM Press

Embedded systems encompass a wide range of applications, technologies, and disciplines,
necessitating a broad approach to education. We describe embedded system coursework
during the first 4 years of university education (the U.S. undergraduate level). Embedded
application curriculum areas include: small and single-microcontroller applications, control 
systems, distributed embedded control, system-on-chip, networking, embedded PCs, 
critical systems, robotics, computer peripherals, wireless data ... 

Keywords: Embedded systems education, curriculum 



8 Evaluation of SystemC Modelling of Reconfigurable Embedded Systems
Tero Rissa, Adam Donlin, Wayne Luk 

March 2005 Proceedings of the conference on Design, Automation and Test in Europe
- Volume 3 DATE '05

Publisher: IEEE Computer Society 

Full text available: pdf(111.41 KB) Additional Information: full citation, abstract, index terms

This paper evaluates the use of pin and cycle accurate SystemC models for embedded 
system design exploration and early software development. The target system is 
MicroBlaze VanillaNet Platform running MicroBlaze uClinux operating system. The paper 
compares Register Transfer Level (RTL) Hardware Description Language (HDL) simulation 
speed to the simulation speed of several different SystemC models. It is shown that 
simulation speed of pin and cycle accurate models can go up to 150 kHz, compared t ... 

9 Integration, Verification and Layout of a Complex Multimedia SOC
Chien-Liang Chen, Jiing-Yuan Lin, Youn-Long Lin

March 2005 Proceedings of the conference on Design, Automation and Test in Europe
- Volume 2 DATE '05

Publisher: IEEE Computer Society 

Full text available: pdf Additional Information: full citation, abstract, index terms

We present our experience of designing a single-chip controller for advanced digital still 
camera from specification all the way to mass production. The process involves 
collaboration with camera system designer, IP vendors, EDA vendors, silicon wafer 
foundry, package & testing houses, and camera maker. We also co-work with academic 
research groups to develop a JPEG codec IP and memory BIST and SOC testing 
methodology. In this presentation, we cover the problems encountered, our solutions, and 
I ... 

Area-Performance Trade-offs in Tiled Dataflow Architectures
Steven Swanson, Andrew Putnam, Martha Mercaldi, Ken Michelson, Andrew Petersen, 
Andrew Schwerin, Mark Oskin, Susan J. Eggers 

May 2006 ACM SIGARCH Computer Architecture News, Proceedings of the 33rd
annual International symposium on Computer Architecture ISCA '06, volume 34 Issue 2

Publisher: IEEE Computer Society, ACM Press 

Full text available: pdf(487.22 KB) Additional Information: full citation, abstract, citings, index terms

Tiled architectures, such as RAW, SmartMemories, TRIPS, and WaveScalar, promise to 
address several issues facing conventional processors, including complexity, wire-delay, 
and performance. The basic premise of these architectures is that larger, higher-
performance implementations can be constructed by replicating the basic tile across the 
chip. This paper explores the area-performance trade-offs when designing one such tiled 
architecture, WaveScalar. We use a synthesizable RTL model and cycle-le ... 

Keywords: WaveScalar, Dataflow computing, ASIC, RTL

Stephen Schmitt, Wolfgang Rosenstiel

February 2004 Proceedings of the conference on Design, automation and test in 
Europe - Volume 3 DATE '04 

Publisher: IEEE Computer Society 

Full text available: pdf(281.80 KB) Additional Information: full citation, abstract, index terms

Rapid prototyping is a fast and efficient way for the functional verification of Systems-on-
a-Chip in an early stage of the design process. Because of the rising part of software in
those systems the use and reuse of microcontroller IP cores is necessary to keep 
development cycles short. Today, prototyping of such IP cores is done with large and 
expensive hardware emulation machines consisting of many processor or FPGA-based 
prototyping boards. In this paper the authors describe an alternative p ... 

2 Special issue: dasCMP'05: A chip prototyping substrate: the flexible architecture for
simulation and testing (FAST)

John D. Davis, Stephen E. Richardson, Charis Charitsis, Kunle Olukotun
November 2005 ACM SIGARCH Computer Architecture News, volume 33 issue 4

Publisher: ACM Press 

Full text available: pdf(333.79 KB) Additional Information: full citation, abstract, references, index terms

We describe a hybrid hardware emulation environment: the Flexible Architecture for 
Simulation and Testing (FAST). FAST integrates field-programmable gate arrays (FPGAs),
microprocessors, and memory to enable rapid prototyping of chip multiprocessors, 
multithreaded architectures, or other novel computer architectures and chip-level memory 
systems. FAST combines configurable and fixed-function hardware and software to 
facilitate rapid prototyping by utilizing components optimized for their particu ... 



Instruction Set Emulation for Rapid Prototyping of SoCs

Jurgen Schnerr, Gunter Haug, Wolfgang Rosenstiel 

March 2003 Proceedings of the conference on Design, Automation and Test in Europe
- Volume 1 DATE '03 

Publisher: IEEE Computer Society 

Full text available: pdf(129.16 KB) Additional Information: full citation, abstract, index terms

In this paper the application of Instruction Set Emulation for rapid prototyping of SoCs will
be presented. The emulation works in a way that both the software and the hardware
behaviour of the emulated processor core is reproduced cycle accurately. This requires 
the use of hardware and software components. The hardware component consists of a 
board containing a VLIW processor and FPGAs. The software component is an instruction 
set simulator of the core running on the VLIW processor. The FPGAs a ... 

FPGA-based computing: A practical FPGA-based framework for novel CMP research
Sewook Wee, Jared Casper, Njuguna Njoroge, Yuriy Teslyar, Daxia Ge, Christos Kozyrakis,
Kunle Olukotun 

February 2007 Proceedings of the 2007 ACM/SIGDA 15th international symposium on 
Field programmable gate arrays FPGA '07 

Publisher: ACM Press 

Full text available: pdf(621.81 KB) Additional Information: full citation, abstract, references, index terms

Chip-multiprocessors are quickly gaining momentum in all segments of computing.
However, the practical success of CMPs strongly depends on addressing the difficulty of
multithreaded application development. To address this challenge, it is necessary to co- 
develop new CMP architecture with novel programming models. Currently, architecture 
research relies on software simulators which are too slow to facilitate interesting 
experiments with CMP software without using small datasets or significantly ... 

Keywords: FPGA-based emulation, chip multi-processor, transactional memory

in hardware generated from C descriptions targeting FPGAs

Alex Jones, Prith Banerjee 

February 2003 Proceedings of the 2003 ACM/SIGDA eleventh international 
symposium on Field programmable gate arrays FPGA '03 

Publisher: ACM Press 

Full text available: pdf(187.05 KB) Additional Information: full citation, abstract

Use of hand optimized Intellectual Property (IP) logic cores is prolific in hardware design. 
While IP cores remain a standard way to utilize the improvement in FPGA technology and 
contend with time to market pressure through reuse, popularity of tools generating 
hardware descriptions from high-level languages is also increasing in popularity. PACT 
HDL combines these two methods within a power-aware framework. The PACT HDL 
compiler generates power optimized VHDL/Verilog from a C language descript ... 

6 Exploiting FPGA-features during the emulation of a fast reactive embedded system
Karlheinz Weiß, Thorsten Steckstor, Gernot Koch, Wolfgang Rosenstiel
February 1999 Proceedings of the 1999 ACM/SIGDA seventh international symposium
on Field programmable gate arrays FPGA '99
Publisher: ACM Press

Full text available: pdf(2.02 MB) Additional Information: full citation, references, citings, index terms

February 2004 Proceedings of the conference on Design, automation and test in

Publisher: IEEE Computer Society

Today, having a fast hardware platform for SoC software development prior to silicon is
an important challenge to gain the time-to-market. The FPGAs offer an excellent
prototyping basis for building hardware platforms since more than ten years ([1]).
However, as the circuit complexity increases and project time-frames shrink, building a
multi-FPGA prototype represents a real challenge from the complexity viewpoint. The
paper describes the state-of-the-art mapping methodology, prototyping tools a

8 Implementation and emulation: An FPGA-based Pentium® in a complete desktop
system

Shih-Lien L. Lu, Peter Yiannacouras, Rolf Kassa, Michael Konow, Taeweon Suh

February 2007 Proceedings of the 2007 ACM/SIGDA 15th international symposium on
Field programmable gate arrays FPGA '07
Publisher: ACM Press 

Full text available: pdf(195.80 KB) Additional Information: full citation, abstract, references, index terms

Software simulation has been the predominant method for architects to evaluate 
microprocessor research proposals. There are three tenets in modeling new designs with 
software models: simulation speed, model accuracy and model completeness. The 
increasing complexity of the processor and accelerated trend to have multiple processors 
on a chip are putting burden on simulators to achieve all tenets mentioned, including 
accurately capturing OS effects. In this work we perform preliminary experimental ... 

Keywords: FPGA, accelerator, emulator, pentium®, processor 



9 FPGA-based systems: An SoC design methodology using FPGAs and embedded
microprocessors
Nobuyuki Ohba, Kohji Takano

June 2004 Proceedings of the 41st annual conference on Design automation DAC '04 

Publisher: ACM Press 

Additional Information: full citation, abstract, references, citings, index terms

In System on Chip (SoC) design, growing design complexity has forced designers to start 
designs at higher abstraction levels. This paper proposes an SoC design methodology that 
makes full use of FPGA capabilities. Design modules in different abstraction levels are all
combined and run together in an FPGA prototyping system that fully emulates the target
SoC. The higher abstraction level design modules run on microprocessors embedded in
the FPGAs, while lower-level synthesizable RTL design module ...

Keywords: ASIC, FPGA prototyping, SoC, mixed-level verification



10 Cycle Accurate Binary Translation for Simulation Acceleration in Rapid Prototyping of
SoCs

Jurgen Schnerr, Oliver Bringmann, Wolfgang Rosenstiel 

March 2005 Proceedings of the conference on Design, Automation and Test in Europe 
- Volume 2 DATE '05 

Publisher: IEEE Computer Society 

Full text available: pdf(138.92 KB) Additional Information: full citation, abstract, index terms

In this paper, the application of a cycle accurate binary translator for rapid prototyping of 
SoCs will be presented. This translator generates code to run on a rapid prototyping 
system consisting of a VLIW processor and FPGAs. The generated code is annotated with 
information that triggers cycle generation for the hardware in parallel to the execution of 
the translated program. The VLIW processor executes the translated program whereas 
the FPGAs contain the hardware for the parallel cycle genera ...

Test: Automatic generation of test sets for SBST of microprocessor IP cores
E. Sanchez, M. Sonza Reorda, G. Squillero, M. Violante

September 2005 Proceedings of the 18th annual symposium on Integrated circuits 
and system design SBCCI '05 

Publisher: ACM Press 

Full text available: pdf(258.50 KB) Additional Information: full citation, abstract, references, index terms

Higher integration densities, smaller feature lengths, and other technology advances, as 
well as architectural evolution, have made microprocessor cores exceptionally complex.
Currently, Software-Based Self-Test (SBST) is becoming an attractive test solution since it 
guarantees high fault coverage figures, runs at-speed, and matches core test 
requirements while exploiting low-cost ATEs. However, automatically generating test 
programs is still an open problem. This paper presents a novel approach ... 

Keywords: FPGA, automatic test generation, hardware accelerator, microprocessor test, 
pipelined architectures, test programs 




12 Ubiquitous Access to Reconfigurable Hardware: Application Scenarios and 
Implementation Issues 

Leandro Soares Indrusiak, Florian Lubitz, Ricardo Reis, Manfred Glesner
March 2003 Proceedings of the conference on Design, Automation and Test in Europe
- Volume 1 DATE '03 
Publisher: IEEE Computer Society 

Full text available: pdf(159.86 KB), Publisher Site
Additional Information: full citation, abstract, citings, index terms

This paper presents an approach for the integration of reconfigurable hardware and 
computer applications based on the concept of ubiquitous computing. The goal is to allow 
a network of reconfigurable hardware modules to be transparently accessible by client 
applications. The communication between them is done at the API level, and a Jini-based 
infrastructure is used to provide an interface for the client applications to find available 
reconfigurable hardware modules over the network. A DES-based ... 

13 Balancing performance and flexibility with hardware support for network architectures 
Ilija Hadzic, Jonathan M. Smith 

November 2003 ACM Transactions on Computer Systems (TOCS), volume 21 issue 4
Publisher: ACM Press 

Full text available: pdf(719.03 KB) Additional Information: full citation, abstract, references, index terms

The goals of performance and flexibility are often at odds in the design of network 
systems. The tension is common enough to justify an architectural solution, rather than a 
set of context-specific solutions. The Programmable Protocol Processing Pipeline (P4) 
design uses programmable hardware to selectively accelerate protocol processing 
functions. A set of field-programmable gate arrays (FPGAs) and an associated library of
network processing modules implemented in hardware are augmented with so ... 

Keywords: FPGA, P4, computer networking, flexibility, hardware, performance, 
programmable logic devices, programmable networks, protocol processing

FPGAs are being used in increasingly complex roles in critical systems, interacting with
conventional critical software. Established safety standards require rigorous justification of 
safety and correctness of the conventional software in such systems. Newer standards 
now make similar requirements for safety-related electronic hardware, such as FPGAs, in
these systems. In this paper we examine the current state-of-the-art in programming 
FPGAs, and their use in conventional (low-criticality) hard ... 

15 Reconfigurable computing: architectures and applications: Using reconfigurability to
achieve real-time profiling for hardware/software codesign
Lesley Shannon, Paul Chow

February 2004 Proceedings of the 2004 ACM/SIGDA 12th international symposium on 
Field programmable gate arrays FPGA '04 

Publisher: ACM Press 

Full text available: pdf(228.02 KB) Additional Information: full citation, abstract, references, citings, index terms

Embedded systems combine a processor with dedicated logic to meet design 
specifications at a reasonable cost. The attempt to amalgamate two distinct design 
environments introduces many problems, one being how to partition a single design for 
the two platforms to achieve the best performance with the least effort. Since the latest 
FPGA technology allows the integration of soft or hard CPU cores with dedicated logic on a 
single chip, this presents new opportunities for addressing hardware/software ... 

Keywords: FPGA, embedded processor, hardware/software codesign, performance 
measurement, profiling, soft processor 



Virtual Hardware Prototyping through Timed Hardware-Software Co-Simulation 
Franco Fummi, Mirko Loghi, Stefano Martini, Marco Monguzzi, Giovanni Perbellini, Massimo
Poncino 

March 2005 Proceedings of the conference on Design, Automation and Test in Europe 
- Volume 2 DATE '05 

Publisher: IEEE Computer Society 

Full text available: pdf(208.94 KB) Additional Information: full citation, abstract, citings, index terms

Designers of factory automation applications increasingly demand for tools for rapid 
prototyping of hardware extensions to existing systems and verification of resulting
behaviors through hardware and software co-simulation. This work presents a framework 
for the timing-accurate co-simulation of HDL models and their verification against 
hardware and software running on an actual embedded device of which only a minimal 
knowledge of the current design is required. Experiments on real-life applicat ... 

Poster session: Making area-performance tradeoffs at the high level using the

R. Anderson

February 2003 Proceedings of the 2003 ACM/SIGDA eleventh international
symposium on Field programmable gate arrays FPGA '03
Publisher: ACM Press 

Full text available: pdf(187.05 KB) Additional Information: full citation, abstract

Applications such as digital cell phones, 3G wireless receivers, and voice over IP, require 
DSP functions that are typically mapped onto general purpose DSP processors. With the 
introduction of advanced FPGA architectures which provide built-in DSP support such as 
the Xilinx Virtex-II, and the Altera Stratix, a new hardware alternative is available for DSP 
designers. DSP design has traditionally been divided into algorithm development and
hardware/software implementation. The majority of DSP alg ...

18 Poster session: A high resolution diagnosis technique for open and short defects in
FPGA interconnects
Mehdi Baradaran Tahoori

February 2003 Proceedings of the 2003 ACM/SIGDA eleventh international 
symposium on Field programmable gate arrays FPGA '03 

Publisher: ACM Press

Full text available: pdf(187.05 KB) Additional Information: full citation, abstract

A two-step diagnosis flow, coarse-grain and fine-grain, is presented in order to identify a 
faulty element in the FPGA interconnects. The fault models used for interconnect are 
open, resistive-open, and bridging fault. The coarse-grain phase identifies the faulty net, 
the routing between two consecutive sequential elements in the FPGA. This phase is 
performed by just post-processing tester results for the test configurations used for 
interconnect testing. During the fine-grain step, the faulty n ...



19 Poster session: A single-FPGA implementation of image connected component
labelling

K. Benkrid, S. Sukhsawas, D. Crookes, S. Belkacemi 
February 2003 Proceedings of the 2003 ACM/SIGDA eleventh international
symposium on Field programmable gate arrays FPGA '03
Publisher: ACM Press 

Full text available: pdf(187.05 KB) Additional Information: full citation, abstract

This paper describes an architecture based on a serial iterative algorithm for Image
Connected Component Labelling with a hardware complexity O(N) for an NxN image. The
algorithm iteratively scans the input image, performing a recursive non-zero maximum 
neighbourhood operation. A complete forward pass is followed by an inverse pass in which 
the image is scanned in reverse order. The process is repeated until no change in the
image occurs. The algorithm has been coded in Handel C language and tar ... 

20 Poster session: FPGA-based design of an evolutionary controller for collision-free
robot navigation
M. A. H. B. Azhar, K. R. Dimond 

February 2003 Proceedings of the 2003 ACM/SIGDA eleventh international
symposium on Field programmable gate arrays FPGA '03
Publisher: ACM Press 

Full text available: pdf(187.05 KB) Additional Information: full citation, abstract

The employment of field programmable gate arrays (FPGAs) to a robot controller is very 
attractive, since it allows for fast IC prototyping and low cost modifications. The speedup 
is achieved because of pipelining and dedicated functions in hardware that are customized 
to the problem. The self learning ability and the adaptive nature of an Artificial Neural 
Network (ANN) makes it a good candidate for the control structure of a robot's navigation. 
An evolutionary approach in designing robots can e ... 